Smart Paper Notes

Neural Implicit Event Generator for Motion Tracking

Mana Masuda , Yusuke Sekikawa , Ryo Fujii , Hideo Saito

Category: Computer Vision

2021-11-06

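We present a novel framework for motion tracking from event data using an implicit representation. Our framework uses a pre-trained event-generation MLP, named the implicit event generator (IEG), and performs motion tracking by updating its state (position and velocity) based on the difference between the observed events and the events generated from the current state estimate. The difference is computed implicitly by the IEG. Unlike the conventional explicit approach, which requires dense computation to evaluate the difference, our implicit approach achieves an efficient state update directly from sparse event data. Our sparse algorithm is especially suitable for mobile robotics applications, where computational resources and battery life are limited. To verify the effectiveness of our method on real-world data, we applied it to an AR marker tracking application. We have confirmed that our framework works well in real-world environments in the presence of noise and background clutter.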


Attention in a family of Boltzmann machines emerging from modern Hopfield networks

Toshihiro Ota , Ryo Karakida

Categories: Machine Learning | Neural and Evolutionary Computing | Machine Learning (Statistics)

2022-12-09

Hopfield networks and Boltzmann machines (BMs) are fundamental energy-based neural network models. Recent studies on modern Hopfield networks have broadened the class of energy functions and led to a unified perspective on general Hopfield networks, including an attention module. In this letter, we consider the BM counterparts of modern Hopfield networks using the associated energy functions, and study their salient properties from a trainability perspective. In particular, the energy function corresponding to the attention module naturally introduces a novel BM, which we refer to as the attentional BM (AttnBM). We verify that AttnBM has a tractable likelihood function and gradient for a special case and is easy to train. Moreover, we reveal the hidden connections between AttnBM and some single-layer models, namely the Gaussian-Bernoulli restricted BM and the denoising autoencoder with softmax units. We also investigate BMs introduced by other energy functions and, in particular, observe that the energy function of dense associative memory models gives BMs belonging to Exponential Family Harmoniums.


Achieving Transparency in Distributed Machine Learning with Explainable Data Collaboration

Anna Bogdanova , Akira Imakura , Tetsuya Sakurai , Tomoya Fujii , Teppei Sakamoto , Hiroyuki Abe

Categories: Machine Learning | Artificial Intelligence

2022-12-06

Transparency of machine learning models used for decision support in various industries has become essential for ensuring their ethical use. To that end, feature attribution methods such as SHAP (SHapley Additive exPlanations) are widely used to explain the predictions of black-box machine learning models to customers and developers. However, a parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data. Such models, trained over horizontally or vertically partitioned data, present a challenge for explainable AI because the explaining party may have a biased view of the background data or a partial view of the feature space. As a result, explanations obtained from different participants in distributed machine learning might not be consistent with one another, undermining trust in the product. This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm (KernelSHAP) and the Data Collaboration method of privacy-preserving distributed machine learning. In particular, we present three algorithms for different scenarios of explainability in Data Collaboration and verify their consistency with experiments on open-access datasets. Our results demonstrate a significant decrease (by at least a factor of 1.75) in feature attribution discrepancies among the users of distributed machine learning.


Location analysis of players in UEFA EURO 2020 and 2022 using generalized valuation of defense by estimating probabilities

Rikuhei Umemoto , Kazushi Tsutsui , Keisuke Fujii

Category: Machine Learning

2022-11-30

Analyzing defenses in team sports is generally challenging because of the limited event data. Researchers have previously proposed methods to evaluate football team defense by predicting the events of ball gain and being attacked using the locations of all players and the ball. However, they did not consider the importance of the events, assumed perfect observation of all 22 players, and did not fully investigate the influence of diversity (e.g., nationality and sex). Here, we propose a generalized valuation method for defensive teams that score-scales the predicted probabilities of the events. Using open-source location data of all players in broadcast video frames from football games of the men's Euro 2020 and women's Euro 2022, we investigated the effect of the number of players on the prediction and validated our approach by analyzing the games. Results show that predictions of being attacked, scoring, and conceding did not require information on all players, while predictions of ball gain required information on three to four offensive and defensive players. With game analyses, we explained the excellence in defense of the finalist teams in Euro 2020. Our approach might be applicable to location data from broadcast video frames in football games.


Composition, Attention, or Both?

Ryo Yoshida , Yohei Oseki

Category: Natural Language Processing

2022-10-24

In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively composes subtrees into a single vector representation with a composition function, and selectively attends to previous structural information with a self-attention mechanism. We investigate whether these components, the composition function and the self-attention mechanism, can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components, with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrate that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena implies that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.


Data Augmentation by Selecting Mixed Classes Considering Distance Between Classes

Shungo Fujii , Yasunori Ishii , Kazuki Kozuka , Tsubasa Hirakawa , Takayoshi Yamashita , Hironobu Fujiyoshi

Categories: Computer Vision | Machine Learning (Statistics)

2022-09-12

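Data augmentation is an essential technique for improving recognition accuracy in object recognition with deep learning. Methods that generate mixed data from multiple data samples, such as mixup, can acquire new diversity that is not included in the training data and thus contribute to improved accuracy. However, because the data used for mixing are sampled at random throughout the training process, appropriate classes or data are not selected in some cases. In this study, we propose a data augmentation method that calculates the distance between classes based on class probabilities and can select data from suitable classes to be mixed during training. The mixed data are adjusted dynamically according to the training trend of each class to facilitate training. The proposed method is applied in combination with conventional methods for generating mixed data. Evaluation experiments show that the proposed method improves recognition performance on general and long-tailed image recognition datasets.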
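
For readers unfamiliar with mixup, the mixing step mentioned in the abstract can be sketched as follows. This is a minimal illustration of standard mixup only, assuming plain Python lists for features and one-hot labels; the function names `sample_lambda` and `mixup_pair` are hypothetical, and the class-distance-based partner selection proposed in the paper is not implemented here because the abstract does not specify it.

```python
import random
from typing import List, Tuple


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw the mixing weight from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(
    x1: List[float],
    y1: List[float],
    x2: List[float],
    y2: List[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Return the convex combination of two samples and of their labels.

    Illustrative assumption: x1/x2 are feature vectors of equal length and
    y1/y2 are one-hot (or soft) label vectors of equal length.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(x1) != len(x2) or len(y1) != len(y2):
        raise ValueError("samples and labels must have matching lengths")
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y


if __name__ == "__main__":
    rng = random.Random(0)
    lam = sample_lambda(0.4, rng)
    # The partner sample here is an arbitrary choice; a class-aware method
    # would choose it differently.
    print(mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam))
```

In this sketch the only place a class-aware method would differ is in how the second sample is chosen before `mixup_pair` is called; the interpolation itself is unchanged.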


Black-box optimization for integer-variable problems using Ising machines and factorization machines

Yuya Seki , Ryo Tamura , Shu Tanaka

Category: Machine Learning

2022-09-01

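Black-box optimization has potential in numerous applications, such as hyperparameter optimization in machine learning and optimization in the design of experiments. Ising machines are useful for binary optimization problems because each variable can be represented by a single binary variable of the Ising machine. However, conventional approaches using an Ising machine cannot handle black-box optimization problems with non-binary values. To overcome this limitation, we propose an approach for integer-variable black-box optimization problems that uses Ising/annealing machines and factorization machines in combination with three different integer-encoding methods. The performance of our approach is evaluated numerically with the different encoding methods on a simple problem: calculating the energy of the hydrogen molecule in its most stable state. The proposed approach can calculate the energy with any of the integer-encoding methods. However, one-hot encoding is useful for small problems.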
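
As an illustration of what an integer encoding looks like, the sketch below shows the common textbook form of one-hot encoding, in which an integer in a bounded range is represented by one binary variable per admissible value. The function names and the `low`/`high` range parameters are assumptions made for this example; the abstract does not give the paper's exact definitions, and the other two encodings it compares are not shown.

```python
from typing import List


def one_hot_encode(value: int, low: int, high: int) -> List[int]:
    """Encode an integer in [low, high] as (high - low + 1) binary variables.

    Exactly one variable is 1: the one whose position equals value - low.
    """
    if not low <= value <= high:
        raise ValueError("value outside the encodable range")
    bits = [0] * (high - low + 1)
    bits[value - low] = 1
    return bits


def one_hot_decode(bits: List[int], low: int) -> int:
    """Recover the integer; reject bit strings that violate the one-hot rule."""
    if any(b not in (0, 1) for b in bits) or sum(bits) != 1:
        raise ValueError("not a valid one-hot bit string")
    return low + bits.index(1)


if __name__ == "__main__":
    encoded = one_hot_encode(3, 0, 4)
    print(encoded, one_hot_decode(encoded, 0))
```

One-hot encoding needs as many binary variables as there are admissible integer values, and only bit strings with a single 1 are valid, so a binary optimizer typically has to be constrained or penalized toward valid strings.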


Non-readily identifiable data collaboration analysis for multiple datasets including personal information

Akira Imakura , Tetsuya Sakurai , Yukihiko Okada , Tomoya Fujii , Teppei Sakamoto , Hiroyuki Abe

Category: Machine Learning

2022-08-31

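Multi-source data fusion, in which multiple data sources are jointly analyzed to obtain improved information, has attracted considerable research attention. For datasets held by multiple medical institutions, data confidentiality and cross-institutional communication are critical. In such cases, data collaboration (DC) analysis, which shares dimensionality-reduced intermediate representations without iterative cross-institutional communication, may be appropriate. When analyzing data that include personal information, the identifiability of the shared data is essential. In this study, the identifiability of DC analysis is investigated. The results reveal that the shared intermediate representations are readily identifiable to the original data for supervised learning. This study then proposes a non-readily identifiable DC analysis that shares only non-readily identifiable data for multiple medical datasets including personal information. The proposed method addresses the identifiability concerns based on a random sample permutation, the concept of interpretable DC analysis, and the use of functions that cannot be reconstructed. In numerical experiments on medical datasets, the proposed method exhibits non-ready identifiability while maintaining the high recognition performance of conventional DC analysis. For a hospital dataset, the proposed method improves recognition performance by nine percentage points over a local analysis that uses only the local dataset.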



Automatic detection of faults in race walking from a smartphone camera: a comparison of an Olympic medalist and university athletes

Tomohiro Suzuki , Kazuya Takeda , Keisuke Fujii

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-24

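Automatic fault detection is a major challenge in many sports. In race walking, referees visually judge faults according to the rules. Hence, ensuring objectivity and fairness in judging is important. To address this issue, some studies have attempted to use sensors and machine learning to detect faults automatically. However, there are problems associated with sensor attachment and with equipment such as high-speed cameras, which conflict with the visual judgment of referees, as well as with the interpretability of the fault detection models. In this study, we propose a fault detection system based on non-contact measurement. We used pose estimation and machine learning models trained on the judgments of multiple qualified referees to achieve fair fault judgment. We verified the system using smartphone videos of normal race walking and of walking with intentional faults, performed by several athletes including a medalist of the Tokyo Olympics. The validation results show that the proposed system detected faults with an average accuracy of over 90%. We also show that the machine learning model detects faults according to the rules of race walking. In addition, the intentionally faulty walking movement of the medalist differed from that of the university walkers. This finding informs the realization of a more general fault detection model. The code and data are available at https://github.com/szucchini/racewalk-aijudge.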


RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling

Jian Liao , Adnan Karim , Shivesh Jadon , Rubaiat Habib Kazi , Ryo Suzuki

Category: Natural Language Processing

2022-08-12

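We present RealityTalk, a system that augments real-time live presentations with speech-driven interactive virtual elements. Augmented presentations leverage embedded visuals and animation for engaging and expressive storytelling. However, existing tools for live presentations often lack interactivity and improvisation, while creating such effects in video editing tools requires significant time and expertise. RealityTalk enables users to create live augmented presentations with real-time speech-driven interactions. The user can interactively prompt, move, and manipulate graphical elements through real-time speech and supporting modalities. Based on our analysis of 177 existing video-edited augmented presentations, we propose a novel set of interaction techniques and incorporate them into RealityTalk. We evaluate our tool from a presenter's perspective to demonstrate the effectiveness of the system.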
